On the Discrepancy Measures for the Optimal Equal Probability Partitioning in Bayesian Multivariate Micro-Aggregation

نویسندگان

  • George Kokolakis
  • Dimitris Fouskakis
چکیده

Data holders, such as statistical institutions and financial organizations, have a very serious and demanding task when producing data for official and public use. It’s about controlling the risk of identity disclosure and protecting sensitive information when they communicate data-sets among themselves, to governmental agencies and to the public. One of the techniques applied is that of micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of IR, with m ∈ IN, is the convex one which is provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, no matter which discrepancy measure has been used. On the basis of the above result, a method that produces a sub-optimal partition with a very small computational cost is presented.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Determination of Maximum Bayesian Entropy Probability Distribution

In this paper, we consider the determination methods of maximum entropy multivariate distributions with given prior under the constraints, that the marginal distributions or the marginals and covariance matrix are prescribed. Next, some numerical solutions are considered for the cases of unavailable closed form of solutions. Finally, these methods are illustrated via some numerical examples.

متن کامل

Bayesian Econometrics Approach in Determining of Effecting Factors on Pollution in Developing Countries (based on Environmental Performance Index)

Emphasis on sustainable development and the need to protect the environment as well as the adverse effects of environmental pollution on the quality of life have made environmental protection one of the main concerns of economic policymakers. For this purpose, approaches to improve the quality of the environment and the factors affecting it have triggered extensive theoretical and empirical stu...

متن کامل

Bayesian Econometrics Approach in Determining of Effecting Factors on Pollution in Developing Countries (based on Environmental Performance Index)

Emphasis on sustainable development and the need to protect the environment as well as the adverse effects of environmental pollution on the quality of life have made environmental protection one of the main concerns of economic policymakers. For this purpose, approaches to improve the quality of the environment and the factors affecting it have triggered extensive theoretical and empirical stu...

متن کامل

An Optimal Bayesian Network Based Solution Scheme for the Constrained Stochastic On-line Equi-Partitioning Problem

A number of intriguing decision scenarios revolve around partitioning a collection of objects to optimize some application specific objective function. This problem is generally referred to as the Object Partitioning Problem (OPP) and is known to be NP-hard. We here consider a particularly challenging version of OPP, namely, the Stochastic On-line Equi-Partitioning Problem (SO-EPP). In SO-EPP, ...

متن کامل

Time series forecasting of Bitcoin price based on ARIMA and machine learning approaches

Bitcoin as the current leader in cryptocurrencies is a new asset class receiving significant attention in the financial and investment community and presents an interesting time series prediction problem. In this paper, some forecasting models based on classical like ARIMA and machine learning approaches including Kriging, Artificial Neural Network (ANN), Bayesian method, Support Vector Machine...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • J. Classification

دوره 25  شماره 

صفحات  -

تاریخ انتشار 2008